ASYMPTOTIC NORMALITY OF HILL ESTIMATOR FOR 

TRUNCATED DATA 



ARIJIT CHAKRABARTY 

Abstract. The problem of estimating the tail index from truncated
data is addressed in Chakrabarty and Samorodnitsky (2009). In that
paper, a sample based (and hence random) choice of k is suggested,
and it is shown that the choice leads to a consistent estimator of the
inverse of the tail index. In this paper, the second order behavior of the
Hill estimator with that choice of k is studied, under some additional
assumptions. In the untruncated situation, it is well known that
asymptotic normality of the Hill estimator follows from the assumption of
second order regular variation of the underlying distribution. Motivated
by this, we show the same in the truncated case in light of the second
order regular variation.



1. Introduction 

Distributions with a regularly varying tail are becoming increasingly
important. Many phenomena arising in fields like telecommunications,
finance and insurance exhibit the presence of such distributions.
Historically, one of the most important statistical issues related to
distributions with regularly varying tail is estimating the tail index
$\alpha$. A detailed discussion on estimators of the tail index can be found
in Chapter 4 of de Haan and Ferreira (2006). One of the most popular
estimators is the Hill estimator, introduced by Hill (1975). For a
one-dimensional non-negative sample $X_1, \ldots, X_n$, the Hill statistic is
defined as

(1.1)    $h(k,n) := \frac{1}{k} \sum_{i=1}^{k} \log \frac{X_{(i)}}{X_{(k)}}$ ,

where $X_{(1)} \ge \ldots \ge X_{(n)}$ are the order statistics of
$X_1, \ldots, X_n$, and $1 \le k \le n$ is a user determined parameter. It is
well known that if $X_1, \ldots, X_n$ is an i.i.d. sample from a distribution
whose tail is regularly varying with index $-\alpha$ and $k$ satisfies
$1 \ll k \ll n$, then $h(k,n)$ consistently estimates $\alpha^{-1}$. In a
sense made precise by Mason (1982), the consistency of the Hill statistic is
equivalent to the regular variation of the tail of the underlying distribution.
Various authors have studied the second order behavior of the Hill
estimator; see for example Davis and Resnick (1984), Csörgő and Mason,
Haeusler and Teugels (1985), Goldie and Smith (1987), Geluk et al. (1997)
and de Haan and Resnick (1998) among others. It is well known that if the
tail of the i.i.d. random variables $X_1, \ldots, X_n$ satisfies a stronger
assumption than regular variation with index $-\alpha$, known as second order
regular variation, then

    $\sqrt{k} \left( h(k,n) - \frac{1}{\alpha} \right) \Longrightarrow N\left( 0, \frac{1}{\alpha^2} \right)$ .
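The Hill statistic (1.1) is straightforward to compute from the k largest order statistics. The following is a minimal Python sketch, not code from the paper: the function name `hill_statistic`, the seed, and the choices n = 20000, k = 500 and alpha = 2 are illustrative assumptions. It evaluates the statistic on a simulated exact Pareto sample, for which the value should be close to 1/alpha.

```python
import math
import random

def hill_statistic(sample, k):
    """Hill statistic h(k, n) = (1/k) * sum_{i=1}^{k} log(X_(i) / X_(k))."""
    if not 1 <= k <= len(sample):
        raise ValueError("k must satisfy 1 <= k <= n")
    top = sorted(sample, reverse=True)[:k]  # X_(1) >= ... >= X_(k)
    if top[-1] <= 0:
        raise ValueError("the k largest observations must be positive")
    return sum(math.log(x / top[-1]) for x in top) / k

# Illustration (assumed setup): exact Pareto(alpha), P(X > x) = x^(-alpha), x >= 1.
rng = random.Random(0)
alpha = 2.0
sample = [(1.0 - rng.random()) ** (-1.0 / alpha) for _ in range(20000)]
estimate = hill_statistic(sample, k=500)  # should be near 1/alpha = 0.5
```

Sorting the whole sample is the simplest choice here; for very large n one could instead select only the k largest values.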



While there are real life phenomena that do exhibit the presence of heavy
tails, in many cases there is a physical upper bound on the possible values.
For example, most internet service providers put an upper bound on the size
of a file that can be transferred using an internet connection provided by
them. Clearly the natural model for such phenomena is a truncated
heavy-tailed distribution, a distribution which fits a heavy-tailed
distribution till a certain point and then decays significantly faster. This
can be made precise in the following way. Suppose that $H, H_1, H_2, \ldots$
are i.i.d. random variables so that $P(H > \cdot)$ is regularly varying with
index $-\alpha$, $\alpha > 0$, and that $L, L_1, L_2, \ldots$ are i.i.d.
random variables independent of $(H, H_1, H_2, \ldots)$. All these random
variables are assumed to take values in the positive half line. We observe
the sample $X_1, \ldots, X_n$ given by

(1.2)    $X_j := H_j 1(H_j \le M_n) + (M_n + L_j) 1(H_j > M_n)$ ,

where $M_n$, representing the truncating threshold, is a sequence of positive
numbers going to infinity. Strictly speaking, the model is actually a
triangular array $\{X_{nj} : 1 \le j \le n\}$. However, in practice we shall
observe only one row of the triangular array, and hence we denote the sample
by the usual notation $X_1, \ldots, X_n$. The random variable $L$ can be
thought of as having a much lighter tail, a tail decaying exponentially fast
for example. However, the results of this article are true under milder
assumptions.
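To make the model (1.2) concrete, here is a small simulation sketch in Python. It is only an illustration under stated assumptions: H is taken to be exact Pareto(alpha) and L exponential, and the function name `truncated_sample`, the seed and all parameter values are hypothetical choices rather than anything prescribed by the paper.

```python
import random

def truncated_sample(n, alpha, m_n, rng, light_rate=1.0):
    """Draw X_1, ..., X_n from model (1.2).

    Assumed for illustration: H is exact Pareto(alpha), P(H > x) = x**(-alpha)
    for x >= 1, and L is exponential with rate light_rate.
    """
    sample = []
    for _ in range(n):
        h = (1.0 - rng.random()) ** (-1.0 / alpha)  # inverse-CDF draw of H
        if h <= m_n:
            sample.append(h)                                  # H_j kept as is
        else:
            sample.append(m_n + rng.expovariate(light_rate))  # replaced by M_n + L_j
    return sample

rng = random.Random(1)
xs = truncated_sample(n=10000, alpha=1.5, m_n=10.0, rng=rng)
exceed = sum(1 for x in xs if x > 10.0)  # roughly n * P(H > M_n) observations
```

With these values n P(H > M_n) is about 316, so the number of observations above the threshold is large, which is the regime (1.3) considered below.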

It was observed in Chakrabarty and Samorodnitsky (2009) that if the
sequence $M_n$ goes to infinity slowly enough so that

(1.3)    $\lim_{n \to \infty} n P(H > M_n) = \infty$ ,

then a priori choosing a $k$ so that the Hill estimator is consistent is a
problem. In order to overcome that problem, the following sample based choice
of $k$ was suggested in that paper:



(1.4)    $k_n := \left[ n \left( \frac{1}{n} \sum_{j=1}^{n} 1\left( X_j > \gamma X_{(1)} \right) \right)^{\beta} \right]$ ,



where $\beta, \gamma \in (0,1)$ are user determined parameters. It has been
shown in that article that this choice of $k_n$ leads to a consistent
estimator of $\alpha^{-1}$ when (1.3) is true, or when that limit is zero. In
this paper, we investigate the second order behavior of $h(k_n, n)$ under the
assumption (1.3) and some additional assumptions. We hope to address the case
when the corresponding limit is zero in future.
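The sample based choice (1.4) depends only on the data and on the two user parameters. A minimal Python sketch follows; the function name `sample_based_k` is hypothetical, and taking the integer part and clamping the result at 1 are assumptions of this sketch.

```python
def sample_based_k(sample, beta, gamma):
    """Sample based k_n of (1.4): integer part of n * (V_n / n)**beta."""
    if not (0.0 < beta < 1.0 and 0.0 < gamma < 1.0):
        raise ValueError("beta and gamma must lie in (0, 1)")
    n = len(sample)
    threshold = gamma * max(sample)                # gamma * X_(1)
    v_n = sum(1 for x in sample if x > threshold)  # number of X_j above gamma * X_(1)
    return max(1, int(n * (v_n / n) ** beta))      # clamp at 1 is a safeguard (assumption)

# Toy data 1, 2, ..., 100: 50 points exceed 0.5 * 100, so k = int(100 * 0.5**0.5).
k = sample_based_k([float(i) for i in range(1, 101)], beta=0.5, gamma=0.5)
```

The resulting k can then be passed to any routine that evaluates the Hill statistic (1.1).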

In Section 2 it is shown that under some assumptions, the Hill estimator
with $k = k_n$ is asymptotically normal with mean $1/\alpha$. In Section 3 we
connect the assumptions of Section 2 to the second order regular variation
of the tail of $H$. In Section 4 we comment on the issues related to using the
results of Sections 2 and 3 in practice, and suggest ways for getting around
some of them.



2. Asymptotic normality of the Hill estimator 

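Suppose that we have a one-dimensional non-negative sample $X_1, \ldots, X_n$
given by (1.2). We shall assume the following throughout this section.

Assumption A: There exists a sequence $(\varepsilon_n)$ such that

(2.1)    $\lim_{n \to \infty} P(H > M_n)^{-(1-\beta)} \varepsilon_n = 0$ ,

(2.2)    $\lim_{n \to \infty} n P(H > M_n) P(L > \varepsilon_n M_n) = 0$ ,

(2.3)    and $\lim_{n \to \infty} P(H > M_n)^{-(1-\beta)} \left( \frac{l(\gamma M_n (1+\varepsilon_n))}{l(\gamma M_n)} - 1 \right) = 0$ ,

where $l(x) := x^{\alpha} P(H > x)$.

Assumption B: $\lim_{n \to \infty} n P(H > M_n) = \infty$.

Assumption C: $\lim_{n \to \infty} n P(H > M_n)^{2-\beta} (\log M_n)^2 = 0$.

Assumption D: For any sequence $(v_n)$ satisfying

(2.4)    $v_n \sim n P(H > \gamma M_n)^{\beta}$ ,

it holds that

    $\lim_{n \to \infty} \sqrt{v_n} \left[ \frac{n}{v_n} P\left( H > b(n/v_n) y^{-1/\alpha} \right) - y \right] = 0$

uniformly on compact sets in $[0,\infty)$, where

(2.5)    $b(y) := \inf \left\{ x : \frac{1}{P(H > x)} \ge y \right\}$ .

Assumption E: For any sequence $(v_n)$ satisfying (2.4),

    $\lim_{T \to \infty} \limsup_{n \to \infty} \sqrt{v_n} \int_T^{\infty} \left| \frac{n}{v_n} P\left( H > b(n/v_n) s \right) - s^{-\alpha} \right| \frac{ds}{s} = 0$ .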



The main result of this section, Theorem 2.1, describes the second order
behavior of $h(k_n, n)$, where $h(\cdot,\cdot)$ and $k_n$ are as defined in
(1.1) and (1.4) respectively, under the assumptions A through E. Of course,
these assumptions are hard to check in practice. However, in Section 3 we show
that most of these can be verified if the tail of $H$ is second order
regularly varying and some additional conditions are satisfied. One could thus
state the hypothesis of Theorem 2.1 in terms of the second order regular
variation. The only reason why we decided not to do that is the following.
The simplest example of a distribution with a regularly varying tail is a
Pareto, which is known to not satisfy the second order regular variation as
defined in Resnick (2007). Hence, if Theorem 2.1 is stated in terms of second
order regular variation, it will not cover simple examples of regularly
varying distributions like Pareto, which clearly satisfy the assumptions A, D
and E.
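For the exact Pareto case the centering in Assumption D can be checked by direct arithmetic, since b(y) = y^(1/alpha) for y >= 1. The Python sketch below does this numerically; the function names and the particular values of n, v_n, y and alpha are illustrative assumptions, and the check covers only the Pareto example mentioned above.

```python
def pareto_tail(x, alpha):
    """P(H > x) for an exact Pareto(alpha) law on [1, infinity)."""
    return x ** (-alpha) if x >= 1.0 else 1.0

def pareto_b(y, alpha):
    """b(y) = inf{x : 1 / P(H > x) >= y}, which equals y**(1/alpha) for y >= 1."""
    if y < 1.0:
        raise ValueError("this sketch only covers y >= 1")
    return y ** (1.0 / alpha)

def assumption_d_gap(n, v_n, y, alpha):
    """(n / v_n) * P(H > b(n / v_n) * y**(-1/alpha)) - y, the quantity in Assumption D."""
    scale = n / v_n
    return scale * pareto_tail(pareto_b(scale, alpha) * y ** (-1.0 / alpha), alpha) - y

# For 0 < y <= n / v_n the gap vanishes up to floating point rounding.
gaps = [assumption_d_gap(10**6, 1000, y, alpha=1.5) for y in (0.1, 1.0, 5.0, 100.0)]
```

Because the gap is identically zero on any compact set once n / v_n is large, multiplying by the square root of v_n does not matter in this example.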

Theorem 2.1. Under assumptions A, B, C, D and E,

(2.6)    $\sqrt{k_n} \left( h(k_n, n) - \frac{1}{\alpha} \right) \Longrightarrow N\left( 0, \frac{1}{\alpha^2} \right)$ .
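One practical reading of (2.6) is an approximate confidence interval for 1/alpha. The Python sketch below is only an illustration: replacing the unknown limiting standard deviation 1/alpha by the estimate itself is a plug-in step assumed here, not a result stated in the paper, and the function name `hill_confidence_interval` is hypothetical.

```python
import math
from statistics import NormalDist

def hill_confidence_interval(h, k, level=0.95):
    """Approximate interval h +/- z * h / sqrt(k) for 1/alpha.

    Uses the N(0, 1/alpha**2) limit with 1/alpha replaced by the estimate h;
    this plug-in step is an assumption of the sketch.
    """
    if k < 1 or h <= 0.0:
        raise ValueError("need k >= 1 and h > 0")
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # standard normal quantile
    half_width = z * h / math.sqrt(k)
    return h - half_width, h + half_width

lower, upper = hill_confidence_interval(h=0.5, k=400)
```

Here h would be the Hill statistic evaluated at the sample based k, and k that same value.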



The following is a brief outline of how we plan to prove this. Define

    $U_n := \sum_{j=1}^{n} 1(X_j > \gamma M_n)$ ,

    $V_n := \sum_{j=1}^{n} 1(X_j > \gamma X_{(1)})$ ,

    $\tilde k_n := [n^{1-\beta} U_n^{\beta}]$ .

Note that

    $k_n = [n^{1-\beta} V_n^{\beta}]$ .

Since we are dealing with a random sum, a natural way of proceeding is
conditioning on the number of summands. However, conditioning on $V_n$ or
$k_n$ destroys the i.i.d. nature of the sample. Hence, we condition on
$U_n = u_n$, where $(u_n)$ is any sequence of integers satisfying
$u_n \sim nP(H > \gamma M_n)$. Lemma 2.1 is a general result, which allows us
to claim weak convergence of the unconditional distribution based on that of
the conditional distribution. Clearly, by conditioning on $U_n$,
$h(\tilde k_n, n)$ becomes the Hill statistic with a deterministic $k$ applied
to a triangular array. The second order behavior of that is studied in
Lemma 2.3. In view of Lemma 2.1, this translates to second order behavior of
(the unconditional distribution of) $h(\tilde k_n, n)$. In order to argue the
claim of Theorem 2.1, all we need is showing that $h(\tilde k_n, n)$ and
$h(k_n, n)$ are not very far apart, and that is done in Lemma 2.4. For
Lemma 2.3 and Lemma 2.4, we need that the tail empirical process, after
suitable centering and scaling, converges to a Brownian motion. This has
been shown in Lemma 2.2.



Lemma 2.1. Suppose that $(B_n : n \ge 1)$ is a sequence of discrete random
variables satisfying

    $\frac{B_n}{b_n} \stackrel{P}{\longrightarrow} 1$

for some deterministic sequence $(b_n)$. Assume that $(A_n : n \ge 1)$ is a
family of random variables such that whenever $(\tilde b_n)$ is any
deterministic sequence satisfying $\tilde b_n \sim b_n$ as $n \to \infty$ and
$P(B_n = \tilde b_n) > 0$,

(2.7)    $P(A_n \le \cdot \mid B_n = \tilde b_n) \Longrightarrow F(\cdot)$ ,

for some c.d.f. $F$. Then $A_n \Longrightarrow F$.

Proof. It suffices to show that every subsequence of $(A_n)$ has a further
subsequence that converges weakly to $F$. Since every sequence that converges
in probability has a subsequence that converges almost surely, we can assume
without loss of generality that

(2.8)    $\frac{B_n}{b_n} \longrightarrow 1$ a.s.

Fix a continuity point $x$ of $F$ and define a function $f_n : R \to [0,1]$ by

    $f_n(u) := \frac{P(A_n \le x, B_n = u)}{P(B_n = u)}$ if $P(B_n = u) > 0$, and $f_n(u) := 0$ otherwise.

Clearly, for all $n \ge 1$,

    $P(A_n \le x) = E f_n(B_n)$ .

By (2.7) and (2.8), it follows that

    $f_n(B_n) \longrightarrow F(x)$ a.s.

By the bounded convergence theorem, it follows that

    $\lim_{n \to \infty} E f_n(B_n) = F(x)$ ,

and this completes the proof. □
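Throughout this section, assumptions A, B, C, D and E will be in force.

Lemma 2.2. Suppose that $(u_n)$ is a sequence of integers satisfying

(2.9)    $u_n \sim nP(H > \gamma M_n)$ ,

and let

(2.10)    $v_n := [n^{1-\beta} u_n^{\beta}] - u_n$ ,

(2.11)    $\tilde M_n := \gamma M_n$ .

Let for $n \ge 1$, $Y_{n,1}, \ldots, Y_{n,n}$ be i.i.d. with c.d.f. $F_n$, defined as

    $F_n(x) := P(H \le x \mid H \le \tilde M_n)$ .

Then,

(2.12)    $\sqrt{v_n} \left( \frac{1}{v_n} \sum_{i=1}^{n-u_n} \delta_{Y_{n-u_n,i}/b((n-u_n)/v_n)} (y^{-1/\alpha}, \infty] - y \right) \Longrightarrow W(y)$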
in $D[0,\infty)$, where $D[0,\infty)$ is endowed with the topology of uniform
convergence on compact sets and $W$ is the standard Brownian motion on
$[0,\infty)$.

Proof. For simplicity, denote $w_n := n - u_n$. It is easy to see by
assumptions B and C that

(2.13)    $1 \ll w_n P(H > \tilde M_n) \ll v_n^{1/2}$ .

Let $(\Gamma_i : i \ge 1)$ be the arrivals of a unit rate Poisson process. Define

    $\phi_n(s) := \frac{\Gamma_{w_n+1}}{v_n} \bar F_n\left( s^{-1/\alpha} b(w_n/v_n) \right)$ ,

where $\bar G := 1 - G$ for any function $G$. By the discussion on page 24 in
Resnick (2007), it follows that

(2.14)    $\lim_{n \to \infty} \frac{w_n}{v_n} P(H > b(w_n/v_n)) = 1$ .

It follows by (2.13) that

    $\lim_{n \to \infty} \frac{w_n}{v_n} P(H > \tilde M_n) = 0$ .

This in conjunction with (2.14) implies that

    $b(w_n/v_n) = o(\tilde M_n)$ .

It is easy to see that $v_n$ satisfies (2.4). Hence, for $n$ large enough,

    $\frac{w_n}{v_n} \bar F_n\left( s^{-1/\alpha} b(w_n/v_n) \right) - s = \frac{ \frac{w_n}{v_n} P\left( H > s^{-1/\alpha} b(w_n/v_n) \right) - \frac{w_n}{v_n} P(H > \tilde M_n) - s + s P(H > \tilde M_n) }{ P(H \le \tilde M_n) }$ ,

and hence in view of Assumption D and (2.13), it follows that for $0 < T < \infty$,

(2.15)    $\lim_{n \to \infty} \sup_{0 \le s \le T} \sqrt{v_n} \left| \frac{w_n}{v_n} \bar F_n\left( s^{-1/\alpha} b(w_n/v_n) \right) - s \right| = 0$ .

Also note that,

    $\sup_{0 \le s \le T} \left| \phi_n(s) - \frac{w_n}{v_n} \bar F_n\left( s^{-1/\alpha} b(w_n/v_n) \right) \right| = \left| \frac{\Gamma_{w_n+1}}{w_n} - 1 \right| \frac{w_n}{v_n} \bar F_n\left( T^{-1/\alpha} b(w_n/v_n) \right)$
    $= O_p(w_n^{-1/2}) O(1)$
    $= o_p(v_n^{-1/2})$ .

This in conjunction with (2.15) shows that

(2.16)    $\sqrt{v_n} \left( \phi_n(s) - s \right) \stackrel{P}{\longrightarrow} 0$

in $D[0,\infty)$. Recall that since $1 \ll v_n \ll w_n$, in $D[0,\infty)$,

    $\sqrt{v_n} \left( \frac{1}{v_n} \sum_{i=1}^{w_n} 1(\Gamma_i \le v_n s) - s \right) \Longrightarrow W(s)$ ;

see (9.7), page 294 in Resnick (2007). Hence, it follows by the continuous
mapping theorem and Slutsky's theorem that

(2.17)    $\sqrt{v_n} \left( \frac{1}{v_n} \sum_{i=1}^{w_n} 1\left( \Gamma_i \le v_n \phi_n(s) \right) - \phi_n(s) \right) \Longrightarrow W(s)$

in $D[0,\infty)$. By similar arguments as those in the proof of Theorem 9.1 in
Resnick (2007), it follows that

    $\sum_{i=1}^{w_n} \delta_{Y_{w_n,i}/b(w_n/v_n)} (s^{-1/\alpha}, \infty] \stackrel{d}{=} \sum_{i=1}^{w_n} 1\left( \Gamma_i \le v_n \phi_n(s) \right)$ .

This along with (2.16) and (2.17) shows (2.12).

□

Lemma 2.3. Let $(u_n)$ be a sequence of integers satisfying (2.9) and let
$(v_n)$ and $(\tilde M_n)$ be as defined in (2.10) and (2.11) respectively. Then,

    $\sqrt{v_n} \left( \frac{1}{v_n} \sum_{i=1}^{v_n} \log \frac{Y_{(n-u_n,i)}}{Y_{(n-u_n,v_n)}} - \frac{1}{\alpha} \right) \Longrightarrow N\left( 0, \frac{1}{\alpha^2} \right)$ ,

where $Y_{(n,1)} \ge \ldots \ge Y_{(n,n)}$ are the order statistics of
$Y_{n,1}, \ldots, Y_{n,n}$, and the latter is as defined in Lemma 2.2.



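Proof. Once again, let us denote $w_n := n - u_n$. An application of Vervaat's
lemma (Proposition 3.3 in Resnick (2007)) to (2.12) shows that

(2.18)    $\sqrt{v_n} \left( \left( \frac{Y_{(w_n,v_n)}}{b(w_n/v_n)} \right)^{-\alpha} - 1 \right) \Longrightarrow -W(1)$

jointly with (2.12). This, in particular, shows that

    $\left( \sqrt{v_n} \left( \frac{1}{v_n} \sum_{i=1}^{w_n} \delta_{Y_{w_n,i}/b(w_n/v_n)} (x, \infty] - x^{-\alpha} \right), \frac{Y_{(w_n,v_n)}}{b(w_n/v_n)} \right) \Longrightarrow \left( W(x^{-\alpha}), 1 \right)$ ,

in $D(0,\infty] \times R$, jointly with (2.18), where $D(0,\infty]$ is also
endowed with the topology of uniform convergence on compact sets. Using the
continuous mapping theorem, it follows that

(2.19)    $\sqrt{v_n} \left( \frac{1}{v_n} \sum_{i=1}^{w_n} \delta_{Y_{w_n,i}/Y_{(w_n,v_n)}} (x, \infty] - x^{-\alpha} \left( \frac{Y_{(w_n,v_n)}}{b(w_n/v_n)} \right)^{-\alpha} \right) \Longrightarrow W(x^{-\alpha})$

in $D(0,\infty]$, jointly with (2.18). As in the proof of Proposition 9.1 in
Resnick (2007), we shall apply the map $\psi$ from $D(0,\infty]$ to $R$,
defined by

    $\psi(f) := \int_1^{\infty} f(s) \frac{ds}{s}$ ,

to conclude that

(2.20)    $\sqrt{v_n} \left( \frac{1}{v_n} \sum_{i=1}^{v_n} \log \frac{Y_{(w_n,i)}}{Y_{(w_n,v_n)}} - \frac{1}{\alpha} \left( \frac{Y_{(w_n,v_n)}}{b(w_n/v_n)} \right)^{-\alpha} \right) \Longrightarrow \int_1^{\infty} W(x^{-\alpha}) \frac{dx}{x}$

jointly with (2.18). This implies that

    $\sqrt{v_n} \left( \frac{1}{v_n} \sum_{i=1}^{v_n} \log \frac{Y_{(w_n,i)}}{Y_{(w_n,v_n)}} - \frac{1}{\alpha} \right) \Longrightarrow \int_1^{\infty} W(x^{-\alpha}) \frac{dx}{x} - \frac{1}{\alpha} W(1)$

as desired. Thus, it suffices to show (2.20).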

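To that end, note that for $1 < T < \infty$, the map $\psi_T$ defined by

    $\psi_T(f) := \int_1^{T} f(s) \frac{ds}{s}$

is continuous and has compact support. Also, as $T \to \infty$,

    $\psi_T(W(s^{-\alpha})) \Longrightarrow \psi(W(s^{-\alpha}))$ .

Some calculations will show that $\psi$ applied to the left hand side of
(2.19) gives the left hand side of (2.20). Thus, all that needs to be done is
justifying the application of $\psi$ to (2.19), and for that, it suffices to
check that for all $\varepsilon > 0$,

    $\lim_{T \to \infty} \limsup_{n \to \infty} P\left( \sqrt{v_n} \left| \int_T^{\infty} \left( \frac{1}{v_n} \sum_{i=1}^{w_n} \delta_{Y_{w_n,i}/Y_{(w_n,v_n)}} (x, \infty] - x^{-\alpha} \left( \frac{Y_{(w_n,v_n)}}{b(w_n/v_n)} \right)^{-\alpha} \right) \frac{dx}{x} \right| > \varepsilon \right) = 0$ .

Note that on the set $\{ Y_{(w_n,v_n)}/b(w_n/v_n) > 1/2 \}$,

    $\left| \int_T^{\infty} \left( \frac{1}{v_n} \sum_{i=1}^{w_n} \delta_{Y_{w_n,i}/Y_{(w_n,v_n)}} (x, \infty] - x^{-\alpha} \left( \frac{Y_{(w_n,v_n)}}{b(w_n/v_n)} \right)^{-\alpha} \right) \frac{dx}{x} \right|$
    $= \left| \int_{T Y_{(w_n,v_n)}/b(w_n/v_n)}^{\infty} \left( \frac{1}{v_n} \sum_{i=1}^{w_n} \delta_{Y_{w_n,i}/b(w_n/v_n)} (u, \infty] - u^{-\alpha} \right) \frac{du}{u} \right|$
    $\le \int_{T/2}^{\infty} \left| \frac{1}{v_n} \sum_{i=1}^{w_n} \delta_{Y_{w_n,i}/b(w_n/v_n)} (u, \infty] - u^{-\alpha} \right| \frac{du}{u}$ .

Since $P[ Y_{(w_n,v_n)}/b(w_n/v_n) \le 1/2 ]$ goes to zero, it suffices to show that

(2.21)    $\lim_{T \to \infty} \limsup_{n \to \infty} P\left( \sqrt{v_n} \int_{T/2}^{\infty} \left| \frac{1}{v_n} \sum_{i=1}^{w_n} \delta_{Y_{w_n,i}/b(w_n/v_n)} (u, \infty] - u^{-\alpha} \right| \frac{du}{u} > \varepsilon \right) = 0$ .

Clearly,

    $\int_{T/2}^{\infty} \left| \frac{1}{v_n} \sum_{i=1}^{w_n} \delta_{Y_{w_n,i}/b(w_n/v_n)} (u, \infty] - u^{-\alpha} \right| \frac{du}{u}$
    $\le \int_{T/2}^{\infty} \left| \frac{1}{v_n} \sum_{i=1}^{w_n} \delta_{Y_{w_n,i}/b(w_n/v_n)} (u, \infty] - \frac{w_n}{v_n} \bar F_n(u b(w_n/v_n)) \right| \frac{du}{u}$
    $\quad + \frac{w_n}{v_n} \int_{T/2}^{\infty} \left| \bar F_n(u b(w_n/v_n)) - P(H > u b(w_n/v_n)) \right| \frac{du}{u}$
    $\quad + \int_{T/2}^{\infty} \left| \frac{w_n}{v_n} P(H > u b(w_n/v_n)) - u^{-\alpha} \right| \frac{du}{u}$
    $= \int_{T/2}^{\infty} \left| \frac{1}{v_n} \sum_{i=1}^{w_n} \delta_{Y_{w_n,i}/b(w_n/v_n)} (u, \infty] - \frac{w_n}{v_n} \bar F_n(u b(w_n/v_n)) \right| \frac{du}{u}$
    $\quad + \frac{w_n}{v_n} \int_{T/2}^{\tilde M_n / b(w_n/v_n)} \left| \bar F_n(u b(w_n/v_n)) - P(H > u b(w_n/v_n)) \right| \frac{du}{u}$
    $\quad + \frac{w_n}{v_n} \int_{\tilde M_n}^{\infty} P(H > u) \frac{du}{u}$
    $\quad + \int_{T/2}^{\infty} \left| \frac{w_n}{v_n} P(H > u b(w_n/v_n)) - u^{-\alpha} \right| \frac{du}{u}$
    $=: I_1 + I_2 + I_3 + I_4$ .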


du 
u 



Since $v_n$ is defined by (2.10), (2.4) holds. By Assumption E, it follows
that

    $\lim_{T \to \infty} \limsup_{n \to \infty} \sqrt{v_n} I_4 = 0$ .

Karamata's theorem (Theorem VIII.9.1, page 281 in Feller (1971)) implies
that

    $I_3 = O\left( \frac{w_n}{v_n} P(H > \tilde M_n) \right) = o\left( v_n^{-1/2} \right)$ ,

the second equality following from (2.13). For $I_2$, note that

    $\left| \bar F_n(u b(w_n/v_n)) - P(H > u b(w_n/v_n)) \right| = \frac{ P(H > \tilde M_n) P(H \le u b(w_n/v_n)) }{ P(H \le \tilde M_n) }$ .

Also, it is easy to see from Assumption C that

(2.22)    $\lim_{n \to \infty} \frac{w_n P(H > \tilde M_n)}{\sqrt{v_n}} \log \frac{\tilde M_n}{b(w_n/v_n)} = 0$ .

Thus,

    $I_2 = O\left( \frac{w_n}{v_n} P(H > \tilde M_n) \log \frac{\tilde M_n}{b(w_n/v_n)} \right) = o\left( v_n^{-1/2} \right)$ ,

the second equality following from (2.22).
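Thus, all that remains is showing

(2.23)    $\lim_{T \to \infty} \limsup_{n \to \infty} P\left[ \sqrt{v_n} I_1 > \varepsilon \right] = 0$ .

Notice that

    $E\left[ \frac{1}{v_n} \sum_{i=1}^{w_n} \delta_{Y_{w_n,i}/b(w_n/v_n)} (u, \infty] \right] = \frac{w_n}{v_n} \bar F_n(u b(w_n/v_n))$ .

Letting $C$ be a finite positive constant independent of $n$, whose value may
change from line to line,

    $E\left( \sqrt{v_n} I_1 \right) \le \sqrt{v_n} \int_{T/2}^{\infty} E\left| \frac{1}{v_n} \sum_{i=1}^{w_n} \delta_{Y_{w_n,i}/b(w_n/v_n)} (u, \infty] - \frac{w_n}{v_n} \bar F_n(u b(w_n/v_n)) \right| \frac{du}{u}$
    $\le C \sqrt{v_n} \int_{T/2}^{\infty} \left[ \mathrm{Var}\left( \frac{1}{v_n} \sum_{i=1}^{w_n} \delta_{Y_{w_n,i}/b(w_n/v_n)} (u, \infty] \right) \right]^{1/2} \frac{du}{u}$
    $\le C \int_{T/2}^{\infty} \left[ \frac{w_n}{v_n} \bar F_n(u b(w_n/v_n)) \right]^{1/2} \frac{du}{u}$
    $\le C \int_{T/2}^{\infty} \left[ \frac{w_n}{v_n} P(H > u b(w_n/v_n)) \right]^{1/2} \frac{du}{u}$ .

By (2.14), the integrand clearly converges to $u^{-\alpha/2}$ as
$n \to \infty$. By (2.14), the integrand is bounded above by a constant
multiple of

    $\left( \frac{P(H > u b(w_n/v_n))}{P(H > b(w_n/v_n))} \right)^{1/2}$ ,

which by the Potter bounds (Proposition 2.6 in Resnick (2007)) is bounded
above by a constant multiple of $u^{-\alpha/3}$ for $n$ large enough. An
appeal to the dominated convergence theorem shows (2.23) and thus completes
the proof. □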



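Lemma 2.4. As $n \to \infty$,

(2.24)    $\sqrt{\tilde k_n} \left| h(k_n, n) - h(\tilde k_n, n) \right| \stackrel{P}{\longrightarrow} 0$ .

Proof. We start with showing that

(2.25)    $\sqrt{\tilde k_n} \left| \frac{k_n}{\tilde k_n} - 1 \right| \stackrel{P}{\longrightarrow} 0$ .

In the proof of Theorem 3.2 in Chakrabarty and Samorodnitsky (2009), it
has been shown that under Assumption B,

(2.26)    $\frac{U_n}{nP(H > \gamma M_n)} \stackrel{P}{\longrightarrow} 1$ ,

(2.27)    $\frac{V_n}{nP(H > \gamma M_n)} \stackrel{P}{\longrightarrow} 1$ ,

(2.28)    and $\frac{\tilde k_n}{nP(H > \gamma M_n)^{\beta}} \stackrel{P}{\longrightarrow} 1$ .

In view of (2.28), it suffices to show that

    $n^{1/2} P(H > M_n)^{\beta/2} \left| \frac{k_n}{\tilde k_n} - 1 \right| \stackrel{P}{\longrightarrow} 0$ .

Note that $k_n$ and $\tilde k_n$ differ from $n^{1-\beta} V_n^{\beta}$ and
$n^{1-\beta} U_n^{\beta}$ respectively by at most one, so that

    $\left| \frac{k_n}{\tilde k_n} - \frac{V_n^{\beta}}{U_n^{\beta}} \right| = O_p\left( n^{-1} P(H > M_n)^{-\beta} \right) = o_p\left( n^{-1/2} P(H > M_n)^{-\beta/2} \right)$ ,

the first equality following from (2.26) and (2.27), and the second from
Assumption B. Thus, it suffices to show that

    $n^{1/2} P(H > M_n)^{\beta/2} \left| \left( \frac{V_n}{U_n} \right)^{\beta} - 1 \right| \stackrel{P}{\longrightarrow} 0$ .

By the mean value theorem, it follows that as $x \to 1$,

    $x^{\beta} - 1 = O(|x - 1|)$ .

Hence, in view of the fact that $V_n/U_n$ converges to 1 in probability, it
suffices to show that

    $n^{1/2} P(H > M_n)^{\beta/2} \left( \frac{U_n}{V_n} - 1 \right) \stackrel{P}{\longrightarrow} 0$ .

Using (2.26) once again, all that needs to be shown is

    $V_n - U_n = o_p\left( n^{1/2} P(H > M_n)^{1-\beta/2} \right)$ .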

Note that on the set $\{ M_n \le X_{(1)} \le M_n(1+\varepsilon_n) \}$, where
$\varepsilon_n$ is chosen to satisfy Assumption A,

    $0 \le U_n - V_n \le \sum_{j=1}^{n} 1\left( \gamma M_n < X_j \le \gamma M_n(1+\varepsilon_n) \right) =: T_n$ .

Thus, it suffices to show that

(2.29)    $\lim_{n \to \infty} P\left( X_{(1)} \le M_n(1+\varepsilon_n) \right) = 1$ ,

(2.30)    $\lim_{n \to \infty} P\left( X_{(1)} \ge M_n \right) = 1$ ,

(2.31)    and $T_n = o_p\left( n^{1/2} P(H > M_n)^{1-\beta/2} \right)$ .

For (2.29), note that as $n \to \infty$,

    $P\left( X_{(1)} \le M_n(1+\varepsilon_n) \right) = \left( 1 - P(H > M_n) P(L > \varepsilon_n M_n) \right)^n \longrightarrow 1$ ,

the convergence following from (2.2) in Assumption A. This shows (2.29).
For (2.30), observe that

    $P\left( X_{(1)} < M_n \right) \le \left( 1 - P(H > M_n) \right)^n$ .

By Assumption B, the right hand side converges to zero, and hence (2.30)
holds. To show (2.31), note that

    $\mathrm{Var}(T_n) \le E(T_n) = n p_n$ ,

where

    $p_n := P\left( \gamma M_n < X_1 \le \gamma (1+\varepsilon_n) M_n \right)$ .

In view of Assumption C, for (2.31), it suffices to show that

(2.32)    $p_n = o\left( P(H > M_n)^{2-\beta} \right)$ .

For $n$ large enough so that $\gamma(1+\varepsilon_n) < 1$,

    $p_n = P(H > \gamma M_n) - \gamma^{-\alpha} M_n^{-\alpha} (1+\varepsilon_n)^{-\alpha} l\left( \gamma M_n(1+\varepsilon_n) \right)$
    $= \gamma^{-\alpha} M_n^{-\alpha} l\left( \gamma M_n(1+\varepsilon_n) \right) \left\{ 1 - (1+\varepsilon_n)^{-\alpha} \right\}$
    $\quad + P(H > \gamma M_n) \left( 1 - \frac{l(\gamma M_n(1+\varepsilon_n))}{l(\gamma M_n)} \right)$ .

The first term on the right hand side is clearly $O(\varepsilon_n P(H > M_n))$,
which by (2.1), is $o\left( P(H > M_n)^{2-\beta} \right)$. By (2.3), it follows
that the second term is also $o\left( P(H > M_n)^{2-\beta} \right)$. This shows
(2.32), and thus completes the proof of (2.25).

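Next, we show that for all $\eta \in R$, as $n \to \infty$,

(2.33)    $\sqrt{\tilde k_n} \log \frac{ X_{(n, [\tilde k_n + \eta \tilde k_n^{1/2}])} }{ X_{(n, \tilde k_n)} } \stackrel{P}{\longrightarrow} -\frac{\eta}{\alpha}$ .

Let $(u_n)$ be a sequence of positive integers satisfying (2.9). For $n$ large
enough so that $1 \le u_n < [n^{1-\beta} u_n^{\beta}] \le n$ and
$1 \le u_n < [n^{1-\beta} u_n^{\beta}] + \eta [n^{1-\beta} u_n^{\beta}]^{1/2} \le n$,
the conditional distribution of
$\left( X_{(n, \tilde k_n)}, X_{(n, [\tilde k_n + \eta \tilde k_n^{1/2}])} \right)$
given that $U_n = u_n$ is same as the (unconditional) distribution of

    $\left( Y_{(n-u_n, [n^{1-\beta} u_n^{\beta}] - u_n)}, \; Y_{(n-u_n, [n^{1-\beta} u_n^{\beta}] + \eta [n^{1-\beta} u_n^{\beta}]^{1/2} - u_n)} \right)$ ,

where $\{ Y_{(n,j)} : 1 \le j \le n \}$ is as defined in Lemma 2.3 with
$\tilde M_n$ as in (2.11). Define $v_n$ as in (2.10). By Lemma 2.2 it follows
that

    $\sqrt{v_n} \left( \frac{1}{v_n} \sum_{i=1}^{n-u_n} \delta_{Y_{n-u_n,i}/b((n-u_n)/v_n)} (y^{-1/\alpha}, \infty] - y \right) \Longrightarrow W(y)$

in $D[0,\infty)$. Using Vervaat's lemma, it follows that

(2.34)    $\sqrt{v_n} \left( \left( \frac{ Y_{(n-u_n, [v_n x])} }{ b((n-u_n)/v_n) } \right)^{-\alpha} - x \right) \Longrightarrow -W(x)$

in $D[0,\infty)$. From here, we conclude that

    $\sqrt{v_n} \left( \left( \frac{ Y_{(n-u_n, [v_n s_n])} }{ b((n-u_n)/v_n) } \right)^{-\alpha} - s_n , \; \left( \frac{ Y_{(n-u_n, v_n)} }{ b((n-u_n)/v_n) } \right)^{-\alpha} - 1 \right) \Longrightarrow \left( -W(1), -W(1) \right)$ ,

where $s_n := 1 + \eta v_n^{-1} [n^{1-\beta} u_n^{\beta}]^{1/2}$. Since the limit
process is $C[0,\infty) \times C[0,\infty)$ valued, this can be done using
Skorohod's Theorem (Theorem 2.2.2 in Borkar). Using the Delta method with
$x \mapsto -\frac{1}{\alpha} \log x$, it follows that

    $\sqrt{v_n} \left( \log \frac{ Y_{(n-u_n, [v_n s_n])} }{ b((n-u_n)/v_n) } + \frac{1}{\alpha} \log s_n , \; \log \frac{ Y_{(n-u_n, v_n)} }{ b((n-u_n)/v_n) } \right) \Longrightarrow \left( \frac{1}{\alpha} W(1), \frac{1}{\alpha} W(1) \right)$ .

Since

    $\lim_{n \to \infty} \sqrt{v_n} \log s_n = \eta$ ,

it follows that

    $\sqrt{v_n} \log \frac{ Y_{(n-u_n, [n^{1-\beta} u_n^{\beta}] + \eta [n^{1-\beta} u_n^{\beta}]^{1/2} - u_n)} }{ Y_{(n-u_n, [n^{1-\beta} u_n^{\beta}] - u_n)} } \stackrel{P}{\longrightarrow} -\frac{\eta}{\alpha}$ .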

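What we have shown is that whenever $(u_n)$ is a sequence satisfying (2.9),
the conditional distribution of the left hand side of (2.33) given
$U_n = u_n$ converges weakly to $-\eta/\alpha$. By an appeal to Lemma 2.1
this shows (2.33).

Coming to the proof of (2.24), note that

    $\sqrt{\tilde k_n} \left| h(k_n, n) - h(\tilde k_n, n) \right| = \sqrt{\tilde k_n} \left| \frac{1}{k_n} \sum_{i=1}^{k_n} \log \frac{X_{(i)}}{X_{(k_n)}} - \frac{1}{\tilde k_n} \sum_{i=1}^{\tilde k_n} \log \frac{X_{(i)}}{X_{(\tilde k_n)}} \right|$
    $\le \frac{1}{\sqrt{\tilde k_n}} \left| \sum_{i=1}^{\tilde k_n} \log \frac{X_{(i)}}{X_{(\tilde k_n)}} - \sum_{i=1}^{k_n} \log \frac{X_{(i)}}{X_{(\tilde k_n)}} \right| + \frac{k_n}{\sqrt{\tilde k_n}} \left| \log \frac{X_{(\tilde k_n)}}{X_{(k_n)}} \right| + \sqrt{\tilde k_n} \left| \frac{1}{\tilde k_n} - \frac{1}{k_n} \right| \sum_{i=1}^{k_n} \log \frac{X_{(i)}}{X_{(k_n)}}$
    $=: A + B + C$ .

Clearly,

    $C = \sqrt{\tilde k_n} \left| 1 - \frac{k_n}{\tilde k_n} \right| h(k_n, n) \stackrel{P}{\longrightarrow} 0$ ,

the convergence in probability following from (2.25) and the fact that

    $h(k_n, n) \stackrel{P}{\longrightarrow} 1/\alpha$ ,

which has been shown in Chakrabarty and Samorodnitsky (2009). For showing
that $B \stackrel{P}{\longrightarrow} 0$, fix $\varepsilon > 0$ and let
$\eta := \varepsilon\alpha/6$. Note that

    $P(|B| > \varepsilon) \le P\left( \frac{k_n}{\tilde k_n} > 2 \right) + P\left( \sqrt{\tilde k_n} \left| \frac{k_n}{\tilde k_n} - 1 \right| > \eta \right) + P\left( \tilde k_n^{1/2} \log \frac{ X_{(\tilde k_n - \eta \tilde k_n^{1/2})} }{ X_{(\tilde k_n + \eta \tilde k_n^{1/2})} } > \frac{\varepsilon}{2} \right)$ .

By (2.25) and (2.33), it follows that $B \stackrel{P}{\longrightarrow} 0$.
Since for $0 < \varepsilon < 1$,

    $P(|A| > \varepsilon) \le P\left( \sqrt{\tilde k_n} \left| \frac{k_n}{\tilde k_n} - 1 \right| > \varepsilon \right) + P\left( \log \frac{ X_{(\tilde k_n - \tilde k_n^{1/2})} }{ X_{(\tilde k_n + \tilde k_n^{1/2})} } > 1 \right)$ ,

it is immediate that $A \stackrel{P}{\longrightarrow} 0$. This completes the proof.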

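□

Proof of Theorem 2.1. In view of Lemma 2.4, it suffices to show that

(2.35)    $\sqrt{\tilde k_n} \left( h(\tilde k_n, n) - \frac{1}{\alpha} \right) \Longrightarrow N\left( 0, \frac{1}{\alpha^2} \right)$ .

Define

    $S_1 := \sum_{i=1}^{U_n} \log \frac{X_{(i)}}{X_{(\tilde k_n)}}$ ,

    $S_2 := \sum_{i=U_n+1}^{\tilde k_n} \log \frac{X_{(i)}}{X_{(\tilde k_n)}}$ ,

and note that on the set $\{ U_n < \tilde k_n \}$,

    $h(\tilde k_n, n) = \frac{1}{\tilde k_n} (S_1 + S_2)$ .

Let $(u_n)$ be a sequence of integers satisfying (2.9) and define $v_n$ and
$\tilde M_n$ as in (2.10) and (2.11). For $n$ large enough, note that

    $[S_2 \mid U_n = u_n] \stackrel{d}{=} \sum_{i=1}^{v_n} \log \frac{ Y_{(n-u_n,i)} }{ Y_{(n-u_n,v_n)} }$ ,

where $\{ Y_{(n,j)} : 1 \le j \le n \}$ is as defined in the statement of
Lemma 2.3. By Lemma 2.3, it follows that

    $\left[ \sqrt{v_n} \left( \frac{1}{v_n} S_2 - \frac{1}{\alpha} \right) \Big| U_n = u_n \right] \Longrightarrow N\left( 0, \frac{1}{\alpha^2} \right)$ .

This along with the fact that, given $U_n = u_n$,

    $\sqrt{\tilde k_n} \left| \frac{1}{\tilde k_n} S_2 - \frac{1}{v_n} S_2 \right| = \frac{S_2}{v_n} \frac{u_n}{\sqrt{\tilde k_n}} = O_p(1) o(1)$ ,

shows that

    $\left[ \sqrt{\tilde k_n} \left( \frac{1}{\tilde k_n} S_2 - \frac{1}{\alpha} \right) \Big| U_n = u_n \right] \Longrightarrow N\left( 0, \frac{1}{\alpha^2} \right)$ .

Since this is true for all sequences of integers $(u_n)$ satisfying (2.9), by
Lemma 2.1 it follows that

    $\sqrt{\tilde k_n} \left( \frac{1}{\tilde k_n} S_2 - \frac{1}{\alpha} \right) \Longrightarrow N\left( 0, \frac{1}{\alpha^2} \right)$ .

On the set $\{ 1 \le X_{(\tilde k_n)} \le X_{(1)} \le 2M_n \}$,

    $\frac{1}{\sqrt{\tilde k_n}} S_1 \le \frac{U_n}{\sqrt{\tilde k_n}} \log(2M_n)$
    $= O_p\left( n^{1/2} P(H > M_n)^{1-\beta/2} \log M_n \right)$
    $= o_p(1)$ .

Since the probability of that set converges to one, it follows that

    $\frac{1}{\sqrt{\tilde k_n}} S_1 \stackrel{P}{\longrightarrow} 0$ .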



This completes the proof. □

3. Second order regular variation

In this section, we show that if the tail of $H$ is second order regularly
varying, and $L$ is sufficiently light-tailed, then the hypotheses of Theorem
2.1 hold. By the tail being second order regularly varying, we mean that
there is a function $A : (0,\infty) \to (0,\infty)$ which is regularly varying
with index $\rho\alpha$ where $\rho < 0$, such that

(3.1)    $\lim_{t \to \infty} \frac{ \frac{P(H > tx)}{P(H > t)} - x^{-\alpha} }{ A(t) } = x^{-\alpha} \frac{ x^{\rho\alpha} - 1 }{ \rho/\alpha }$

for all $x > 0$; see (2.3.24) in de Haan and Ferreira (2006).
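The limit in (3.1) can be checked numerically on a simple example. The Python sketch below uses a hypothetical tail P(H > x) = x^(-alpha) (1 - x^(-tau)/2) for x >= 1 with 0 < tau < alpha, for which rho = -tau/alpha and one admissible auxiliary function is A(t) = (tau/alpha^2) t^(-tau)/2. Both the tail and this choice of A are assumptions made for illustration; they are not taken from the paper.

```python
def tail(x, alpha, tau):
    """Hypothetical tail P(H > x) = x**(-alpha) * (1 - x**(-tau) / 2), x >= 1, tau < alpha."""
    return x ** (-alpha) * (1.0 - 0.5 * x ** (-tau))

def second_order_ratio(t, x, alpha, tau):
    """Left hand side of (3.1) at a finite t, with A(t) = (tau / alpha**2) * t**(-tau) / 2."""
    a_t = (tau / alpha ** 2) * 0.5 * t ** (-tau)  # regularly varying with index rho * alpha = -tau
    return (tail(t * x, alpha, tau) / tail(t, alpha, tau) - x ** (-alpha)) / a_t

def second_order_limit(x, alpha, tau):
    """Right hand side of (3.1) with rho = -tau / alpha."""
    rho = -tau / alpha
    return x ** (-alpha) * (x ** (rho * alpha) - 1.0) / (rho / alpha)

ratio = second_order_ratio(t=1e4, x=2.0, alpha=2.0, tau=1.0)
limit = second_order_limit(x=2.0, alpha=2.0, tau=1.0)
```

For this example the finite-t ratio differs from the limit by a factor 1/(1 - t^(-tau)/2), so the agreement improves as t grows.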
Theorem 3.1. Suppose that

(3.2)    $\max(1 - 1/\alpha, 0) < \beta < 1$ ,

all moments of $L$ are finite, $M_n$ satisfies assumptions B and C, and the
tail of $H$ is second order regularly varying so that the second order
parameter $\rho$ satisfies

    $\rho < -\frac{1-\beta}{\beta}$ .

Then, (2.6) holds.

Proof. In view of Theorem 2.1, it suffices to check that assumptions A, D
and E hold. By Theorem 2.3.9 in de Haan and Ferreira (2006), it follows
that given $\varepsilon, \delta > 0$, there exists $t_0 > 1$ such that
whenever $t, tx \ge t_0$,

(3.3)    $\left| \frac{ \frac{P(H > tx)}{P(H > t)} - x^{-\alpha} }{ A(t) } - x^{-\alpha} \frac{ x^{\rho\alpha} - 1 }{ \rho/\alpha } \right| \le \varepsilon x^{-\alpha+\rho\alpha} \max\left( x^{\delta}, x^{-\delta} \right)$ .

Note that (3.3) holds with a possibly different $A(t)$ from that in (3.1).
However, this $A$ is also regularly varying with index $\rho\alpha$. For the
rest of the proof, by $A(\cdot)$, we shall mean the one for which (3.3) holds.

We start with showing that

(3.4)    $\sqrt{v_n} = o\left( A(b(n/v_n))^{-1} \right)$ ,

whenever $v_n$ is a sequence satisfying (2.4). Let

    $\eta := -\rho\beta - (1-\beta)$ .

The upper bound on $\rho$ implies $\eta > 0$. Note that $A(b(\cdot))$ varies
regularly with index $\rho$ and $n/v_n \sim P(H > \gamma M_n)^{-\beta}$. Thus,
there is a slowly varying function $\tilde l$ so that

    $A(b(n/v_n))^{-1} = \tilde l(M_n) P(H > M_n)^{\rho\beta}$
    $\gg P(H > M_n)^{\rho\beta+\eta}$
    $= \frac{ n^{1/2} P(H > M_n)^{\beta/2} }{ n^{1/2} P(H > M_n)^{1-\beta/2} }$
    $\gg n^{1/2} P(H > M_n)^{\beta/2}$ ,

the inequality in the last line following from Assumption C. Since $v_n$ is
of the same order as $nP(H > M_n)^{\beta}$ by (2.4), this shows (3.4).

Now, we show that assumptions D and E hold. Let

    $\varepsilon_n := A(b(n/v_n)) \wedge (1/2)$ .

Clearly $1 > \varepsilon_n > 0$ for all $n$. Recall from (2.5) that
$z < b(y)$ iff $1/P(H > z) < y$. Thus,

    $\frac{1}{P\left( H > (1-\varepsilon_n) b(n/v_n) \right)} < \frac{n}{v_n} \le \frac{1}{P\left( H > b(n/v_n) \right)}$ .

Let $\delta > 0$ be such that $\rho\alpha + \delta < 0$. Let $t_0$ be such that
whenever $t, tx \ge t_0$, (3.3) holds with $\varepsilon = 1$ and this $\delta$.
Fix $0 < T < \infty$. Let $N$ be such that for $n \ge N$,
$b(n/v_n) \ge 2t_0 \vee t_0/T$. Thus, there is $C < \infty$, whose value may
change from line to line, depending only on $T$, so that for $n \ge N$ and
$x \ge T$,

    $\left| \frac{ P(H > b(n/v_n) x) }{ P(H > b(n/v_n)) } - x^{-\alpha} \right| \le C A(b(n/v_n)) x^{-\alpha}$ ,

the inequality following from (3.3) since $\rho\alpha + \delta < 0$, and similarly

    $\left| \frac{ P(H > b(n/v_n) x) }{ P\left( H > (1-\varepsilon_n) b(n/v_n) \right) } - x^{-\alpha} (1-\varepsilon_n)^{\alpha} \right| \le C A\left( (1-\varepsilon_n) b(n/v_n) \right) x^{-\alpha}$
    $\le C A(b(n/v_n)) x^{-\alpha}$ .

Since

    $(1-\varepsilon_n)^{\alpha} - 1 = O(\varepsilon_n) = O\left( A(b(n/v_n)) \right)$ ,

it follows that there is (a possibly different) $C < \infty$ so that for all
$x \ge T$,

    $\left| \frac{n}{v_n} P(H > b(n/v_n) x) - x^{-\alpha} \right| \le C A(b(n/v_n)) x^{-\alpha}$ .

This in view of (3.4) shows that assumptions D and E hold.

Finally, we show that Assumption A holds. By (3.2), it follows that

    $1 - \alpha(1-\beta) > 0$ .

Let $p > 0$ be such that

    $\frac{\alpha(1-\beta)}{p} < 1 - \alpha(1-\beta)$ .

This choice of $p$ ensures that

(3.5)    $\frac{\alpha(2-\beta)}{p} < 1 - \alpha\left( 1 - \beta - \frac{1}{p} \right)$ .

Note that $x P(H > x)^{1-\beta-1/p}$ is regularly varying with index
$1 - \alpha(1-\beta-1/p)$ and $P(H > x)^{-(2-\beta)/p}$ is regularly varying
with index $\alpha(2-\beta)/p$. Thus, by (3.5), it follows that

    $M_n P(H > M_n)^{1-\beta-1/p} \gg P(H > M_n)^{-(2-\beta)/p} \gg n^{1/p}$ ,

the last inequality following from Assumption C. Thus

    $n^{1/p} P(H > M_n)^{1/p} M_n^{-1} \ll P(H > M_n)^{1-\beta}$ .

Let $(\varepsilon_n)$ be such that

    $n^{1/p} P(H > M_n)^{1/p} M_n^{-1} \ll \varepsilon_n \ll P(H > M_n)^{1-\beta}$ .

Clearly, (2.1) holds with this choice of $(\varepsilon_n)$. For (2.2), note
that since $E L^p < \infty$,

    $nP(H > M_n) P(L > \varepsilon_n M_n) = O\left( nP(H > M_n) \varepsilon_n^{-p} M_n^{-p} \right) = o(1)$ .

This shows (2.2). Finally, for (2.3), choose $\delta > 0$ so that
$\rho\alpha + \delta < 0$. Let $t_0$ be such that (3.3) holds with this
$\delta$ and $\varepsilon = 1$. Thus, as $n \to \infty$,

    $\frac{ l(\gamma M_n(1+\varepsilon_n)) }{ l(\gamma M_n) } - 1 = O\left( \frac{ P(H > \gamma M_n(1+\varepsilon_n)) }{ P(H > \gamma M_n) } - (1+\varepsilon_n)^{-\alpha} \right)$
    $= O\left( A(M_n) M_n^{-\alpha+\rho\alpha+\delta} \right)$
    $= o\left( P(H > M_n)^{1-\beta} \right)$ ,

the last step following from the observations that

    $P(H > M_n)^{-(1-\beta)} A(M_n) M_n^{-\alpha+\rho\alpha+\delta} = \left\{ M_n^{-\alpha} P(H > M_n)^{-(1-\beta)} \right\} M_n^{\rho\alpha+\delta} A(M_n)$

and that each of the three terms on the right hand side goes to zero. This
shows that Assumption A holds and thus completes the proof. □

4. How to use this in practice

While the assumptions A, D and E mentioned in Section 2 can be verified
by assuming the second order regular variation and that all moments of $L$
are finite, one still needs a way to check assumptions B and C in practice.
Statistical tests for checking Assumption B have been discussed in
Chakrabarty and Samorodnitsky (2009). For checking Assumption C, which means
that $M_n$ grows fast enough, one can use the facts that

    $\frac{X_{(1)}}{M_n} \stackrel{P}{\longrightarrow} 1$

and

    $\frac{ \sum_{j=1}^{n} 1(X_j > \gamma M_n) }{ nP(H > \gamma M_n) } \stackrel{P}{\longrightarrow} 1$

for $0 < \gamma < 1$. These facts have been proved in Chakrabarty and
Samorodnitsky (2009). An immediate consequence of these is that if
Assumption C holds, then

    $n \left( \frac{1}{n} \sum_{j=1}^{n} 1\left( X_j > \gamma X_{(1)} \right) \right)^{2-\beta} \left( \log X_{(1)} \right)^2 \stackrel{P}{\longrightarrow} 0$ .

Thus, a natural thing to do is to choose $\beta$ (if possible) such that the
above is satisfied.
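The quantity in the last display is a function of the sample only, so it can be evaluated for several candidate values of beta. The Python sketch below is a minimal illustration; the function name `assumption_c_diagnostic` is hypothetical, and how small the value must be before Assumption C is considered plausible is not specified by the paper.

```python
import math

def assumption_c_diagnostic(sample, beta, gamma):
    """n * (V_n / n)**(2 - beta) * (log X_(1))**2, with V_n = #{j : X_j > gamma * X_(1)}."""
    n = len(sample)
    top = max(sample)                                 # X_(1)
    v_n = sum(1 for x in sample if x > gamma * top)   # exceedances of gamma * X_(1)
    return n * (v_n / n) ** (2.0 - beta) * math.log(top) ** 2

# Toy data 1, 2, ..., 100 with gamma = 0.5: V_n = 50 and X_(1) = 100.
stat = assumption_c_diagnostic([float(i) for i in range(1, 101)], beta=0.5, gamma=0.5)
```

Since V_n/n is at most one, the statistic is increasing in beta for fixed data, so smaller values of beta make the display easier to satisfy.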

We would like to mention at this point that from the point of view of
using Theorem 3.1, some issues remain unresolved. One of them is how one
ensures (3.2). A naive method would be to first get a "rough" estimate
of $\alpha$ and then choose $\beta$ to satisfy the above. However, it is not
clear at the moment that this is going to work. The other unresolved issue is
that of checking the second order regular variation in the data and that
$\rho < -(1-\beta)/\beta$. But then part of this is also a criticism for the
Hill statistic applied to untruncated data; the same is known to be
asymptotically normal only under some form of second order regular variation.